Growth Curves

Julin Maloof

Objectives

Do the different populations grow at different rates?

  • How do we do this?
  • What are some challenges?

If growth were simple

\[ height = Intercept + slope * time + Error \\ height = Intercept + slope_{pop} * time + Error \]

What do we do with this?

Need a mathematical equation that produces these types of curves

Weibull to the rescue

There are actually a dozen or more equations that fit this type of data. Some are listed here

We will use the Weibull model

\[ y(t) = \alpha - \beta \cdot e^{-(k \cdot t)^\delta} \] This models height (y) at time \(t\) as a function of parameters \(\alpha\), \(\beta\), \(k\), and \(\delta\)

Lets play around to see what each of these does

Weibull function in R

weibull <- function (t, alpha, beta, k, delta) {
  result <- alpha - (alpha - beta) * exp(-(k * t)^delta)
  return(result)
}

growth <- tibble(time = seq(0,100,.1)) %>%
  mutate(model1 = weibull(t = time,
                          alpha = 60,
                          beta = 10,
                          k = .02,
                          delta = 5))

Exercise 1

  1. Plot the results of this code
  2. adjust the parameters to determine how each one affects the curve. Only change one at a time
weibull <- function (t, alpha, beta, k, delta) 
{
  result <- alpha - (alpha - beta) * exp(-(k * t)^delta)
  return(result)
}

growth <- tibble(time = seq(0,50,.1)) %>%
  mutate(model1 = weibull(t = time,
                          alpha = 100,
                          beta = 10,
                          k = .05,
                          delta = 3))

Exercise 2

Use the cleaned data sent by Julin to generate this plot:

The next problem: Parameter Estimation

Parameter estimation

  • We need a statistical model that fits our data.
  • Parameters are the model coefficients that are tuned to achieve a fit between a statistical model and data.
  • For a linear regression the model parameters are the Intercept(s) and slope(s)
  • In the plot below we might want a model that describes the relationship between engine size and m.p.g.
  • I show two different models. Which fits the data better?
  • What is the statistical criteria for judging fit?

Judge fit by SSE

  • A classic way to judge goodness of fit is by the sum squared error (SSE):

\(\sum\limits_{i=1}^N(obs_i - predicted_i)^2\)

  • OK, but how do we find the parameters the minimize SSE?
  • For linear regressions this can be mathematically solved
  • For non-linear models, such as the Weibull growth model, it is essentially trial and error.
  • The computer picks a set of parameter values, calulates SSE, and then picks a new set.
  • If the new set is better than the old set, keep the new values
  • repeat until SSE is stable
  • There are very sophisticated algorithms to do this.

Let’s give it a try!

tm2 <- tdb %>% filter(pop=="TM2") %>%
  mutate(day=as.numeric(survey_date-min(survey_date)+30),
         mf=factor(mf))

m1 <- nls(height_cm ~ weibull(day, alpha, beta, k, delta),
          data=tm2,
          start = c(alpha=60, beta=1, k = .02, delta = 5), 
          lower = c(20, 0, 0, 2),
          upper = c(120, 20, .5, 10),
          trace = TRUE, 
          algorithm = "port")
  0:     110855.43:  60.0000  1.00000 0.0200000  5.00000
  1:     47332.913:  46.1813  0.00000 0.0188439  2.77590
  2:     28882.632:  39.3578  0.00000 0.0158662  2.00000
  3:     22084.278:  43.0125  0.00000 0.0128837  2.27667
  4:     16972.307:  45.0951  0.00000 0.0116162  3.33645
  5:     13636.271:  51.6978  0.00000 0.0102960  3.86786
  6:     10786.422:  56.7693  8.30227 0.00970057  5.85156
  7:     10295.173:  56.1451  7.24880 0.00999914  7.06968
  8:     10255.319:  56.9993  7.39982 0.00989561  7.26822
  9:     10253.209:  56.8254  7.48593 0.00991289  7.43640
 10:     10253.083:  56.8511  7.51245 0.00990859  7.45918
 11:     10253.076:  56.8438  7.51932 0.00990896  7.46931
 12:     10253.075:  56.8441  7.52125 0.00990880  7.47120
 13:     10253.075:  56.8438  7.52172 0.00990881  7.47181
 14:     10253.075:  56.8438  7.52185 0.00990880  7.47195
 15:     10253.075:  56.8438  7.52188 0.00990880  7.47198

look at the result and make predictions

m1
Nonlinear regression model
  model: height_cm ~ weibull(day, alpha, beta, k, delta)
   data: tm2
    alpha      beta         k     delta 
56.843802  7.521879  0.009909  7.471984 
 residual sum-of-squares: 20506

Algorithm "port", convergence message: relative convergence (4)
tm2 <- tm2 %>% mutate(pred=predict(m1))

plot it

tm2 %>%
  ggplot(aes(x=day)) +
  geom_line(aes(y=height_cm, group=plantID), alpha = 0.3) +
  geom_line(aes(y=pred), lwd=2, color="red") 

problems

  • Need to compare between populations
  • Want p-values, statistical inference